AITopics | convolution and attention

Collaborating Authors

convolution and attention

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Revisiting the Integration of Convolution and Attention for Vision Backbone

Neural Information Processing SystemsMay-27-2025, 00:33:07 GMT

Convolutions (Convs) and multi-head self-attentions (MHSAs) are typically considered alternatives to each other for building vision backbones. Although some works try to integrate both, they apply the two operators simultaneously at the finest pixel granularity. With Convs responsible for per-pixel feature extraction already, the question is whether we still need to include the heavy MHSAs at such a fine-grained level. In fact, this is the root cause of the scalability issue w.r.t. the input resolution for vision transformers. To address this important problem, we propose in this work to use MSHAs and Convs in parallel \textbf{at different granularity levels} instead. Specifically, in each layer, we use two different ways to represent an image: a fine-grained regular grid and a coarse-grained set of semantic slots.

artificial intelligence, convolution and attention, vision backbone, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Vision (0.40)

Add feedback

On the interplay of adversarial robustness and architecture components: patches, convolution and attention

Croce, Francesco, Hein, Matthias

arXiv.org Artificial IntelligenceSep-14-2022

In recent years novel architecture components for image classification have been developed, starting with attention and patches used in transformers. While prior works have analyzed the influence of some aspects of architecture components on the robustness to adversarial attacks, in particular for vision transformers, the understanding of the main factors is still limited. We compare several (non)-robust classifiers with different architectures and study their properties, including the effect of adversarial training on the interpretability of the learnt features and robustness to unseen threat models. An ablation from ResNet to ConvNeXt reveals key architectural changes leading to almost $10\%$ higher $\ell_\infty$-robustness.

artificial intelligence, machine learning, robustness, (16 more...)

arXiv.org Artificial Intelligence

2209.06953

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > United States > Maryland > Baltimore (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback